Stackalloc and Span<T>

Stack-based buffer allocation for performance-critical paths

This lesson covers stackalloc for stack-based allocation, Span<T> for safe memory access, and real-world optimization patterns: essential concepts for building efficient .NET applications that minimize garbage collection pressure.

Welcome

Welcome to one of the most powerful yet underutilized features in modern .NET! 💻 If you've ever wondered how to write blazingly fast code that doesn't trigger constant garbage collections, you're in the right place. stackalloc and Span<T> represent a paradigm shift in how we think about memory in managed languages, giving you C-like performance with C#'s safety guarantees.

In this lesson, you'll learn when and how to allocate memory on the stack instead of the heap, how Span<T> provides a unified abstraction over different memory sources, and the critical safety rules that prevent memory corruption. By the end, you'll understand why these features are the secret sauce behind high-performance libraries like ASP.NET Core and System.Text.Json.

Core Concepts

Understanding Stack vs. Heap Allocation

Before diving into stackalloc, let's revisit the fundamental difference between stack and heap memory:

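| Characteristic   | Stack 🔺                                   | Heap 📦                                             |
|------------------|--------------------------------------------|-----------------------------------------------------|
| Lifetime         | Scope-bound (method duration)              | GC-controlled (can outlive method)                  |
| Allocation Speed | ⚡ Extremely fast (pointer bump)            | Slower (GC bookkeeping)                             |
| Deallocation     | Automatic (stack unwind)                   | Reclaimed by the GC (collections can pause the app) |
| Size Limit       | ~1 MB per thread by default (OS-dependent) | Limited by available memory                         |
| Fragmentation    | None                                       | Can fragment over time                              |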

The Golden Rule: Stack allocation is perfect for small, short-lived buffers. Heap allocation is necessary for data that outlives the current method or exceeds safe stack size limits.

What is stackalloc?

stackalloc is a C# keyword that allocates memory directly on the call stack instead of the managed heap. This means:

✅ Zero GC pressure - No garbage collection involved
✅ Ultra-fast allocation - Just moving the stack pointer
✅ Automatic cleanup - Memory freed when method returns
✅ Cache-friendly - Stack memory is typically hot in CPU cache

⚠️ But with constraints:

❌ Must be small - Large allocations risk stack overflow
❌ Cannot escape scope - Cannot return or store in fields
❌ Cannot live across an await - Stack frames don't persist across await (before C# 13, Span<T> locals are rejected in async methods altogether)

Basic syntax:

// C# 7.2+: in safe code, assign the result to Span<T> or ReadOnlySpan<T>
Span<int> numbers = stackalloc int[100];

// Older unsafe code (not recommended)
unsafe
{
    int* ptr = stackalloc int[100];
}

💡 Tip: Modern C# uses stackalloc with Span<T> to provide bounds checking and eliminate unsafe code!
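
To make the bounds-checking point concrete, here is a minimal sketch written as a top-level program. DemoBoundsCheck is a hypothetical helper name introduced only for this illustration; it shows that writing one element past the end of a stack-allocated span throws instead of silently corrupting the stack.

```csharp
using System;

// Hypothetical helper for illustration: returns true when the
// out-of-range write is rejected by Span<T>'s bounds check.
static bool DemoBoundsCheck()
{
    Span<int> numbers = stackalloc int[4];   // 16 bytes on the stack
    numbers.Fill(7);

    try
    {
        numbers[4] = 1;                      // index 4 is one past the end
        return false;                        // not reached
    }
    catch (IndexOutOfRangeException)
    {
        return true;                         // the write never happened
    }
}

Console.WriteLine(DemoBoundsCheck());        // True
```

The same mistake through an int* in an unsafe block would compile and run, overwriting whatever happens to sit next to the buffer.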

What is Span<T>?

Span<T> is a ref struct that provides a type-safe, memory-efficient view over contiguous memory regions. Think of it as a "window" that can look at:

  • Stack-allocated memory (stackalloc)
  • Heap-allocated arrays
  • Unmanaged memory
  • String internals (via ReadOnlySpan<char>)

┌─────────────────────────────────────────────┐
│           SPAN MEMORY ABSTRACTION           │
├─────────────────────────────────────────────┤
│                                             │
│  Stack Memory    ←──────┐                   │
│  (stackalloc)           │                   │
│                         │                   │
│  Heap Array      ←──────┼──── Span<T>       │
│  (new T[])              │     (unified      │
│                         │      API)         │
│  Unmanaged       ←──────┤                   │
│  (native alloc)         │                   │
│                         │                   │
│  String Slice    ←──────┘                   │
│  (AsSpan())                                 │
│                                             │
└─────────────────────────────────────────────┘

Key characteristics of Span<T>:

  1. Ref struct - Cannot be boxed, cannot be a field in regular classes, cannot cross async boundaries
  2. Stack-only - Can only live on the stack, ensuring safety
  3. Zero-copy slicing - Creating sub-spans is free (just pointer + length)
  4. Bounds-checked - Prevents buffer overruns at runtime
  5. Performance - JIT treats it specially for optimal codegen

// Creating spans from different sources
int[] array = new int[100];
Span<int> fromArray = array.AsSpan();

Span<int> fromStack = stackalloc int[50];

Span<int> slice = fromArray.Slice(10, 20); // Zero-copy view of elements 10-29
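
Because a slice is only a pointer plus a length, it shares storage with its source. The short sketch below (a top-level program with illustrative values) shows a write through the slice landing in the original array:

```csharp
using System;

int[] array = { 0, 1, 2, 3, 4, 5 };

// View over elements 2, 3 and 4; nothing is copied.
Span<int> middle = array.AsSpan().Slice(2, 3);

middle[0] = 42;                        // writes through to array[2]

Console.WriteLine(array[2]);           // 42
Console.WriteLine(middle.Length);      // 3
```

If an independent copy is needed, call ToArray() on the span, which does allocate.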

Memory Layout: How Span<T> Works

Under the hood, Span<T> is incredibly simple:

public readonly ref struct Span<T>
{
    private readonly ref T _reference;  // Pointer to first element
    private readonly int _length;       // Number of elements
    
    // ... methods ...
}

This means a Span<T> is just 16 bytes on 64-bit systems (8-byte pointer + 4-byte int + padding), regardless of how much memory it references!

MEMORY LAYOUT EXAMPLE

Stack Frame:
┌─────────────────────────────────┐
│  Span<int> data                 │
│  ┌───────────────────────────┐  │
│  │ _reference: 0x7FFE1234    │  │  (8 bytes)
│  │ _length: 100              │  │  (4 bytes)
│  └───────────────────────────┘  │
│                                 │
│  int[100] buffer                │
│  ┌─┬─┬─┬─┬─┬─┬───┬──┬──┬──┐     │
│  │0│1│2│3│4│5│...│97│98│99│     │  (400 bytes)
│  └─┴─┴─┴─┴─┴─┴───┴──┴──┴──┘     │
│   ↑                             │
│   └─ _reference points here     │
└─────────────────────────────────┘

Safety Guarantees and Restrictions

The compiler enforces strict rules to prevent memory corruption:

1. Ref structs cannot escape the stack:

// ❌ COMPILER ERROR: Cannot be class field
public class MyClass
{
    private Span<int> _data; // Error CS8345
}

// ❌ COMPILER ERROR: Cannot be boxed
object boxed = mySpan; // Error

// ❌ COMPILER ERROR (before C# 13): Span locals in async methods
async Task ProcessAsync()
{
    Span<int> data = stackalloc int[100]; // Error CS4012 before C# 13
    await Task.Delay(100);
    // C# 13+ accepts the declaration, but the span still cannot be used across an await
}
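
One common way around the async restriction above is to keep every Span<T> and stackalloc inside a synchronous helper that the async method calls between awaits. The sketch below is a top-level program; SumOfSquares and this ProcessAsync overload are hypothetical names chosen for illustration, and the 128-element threshold is an assumption.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical synchronous helper: all span usage is confined to this frame.
static int SumOfSquares(int count)
{
    // Assumed threshold: 128 ints (512 bytes) on the stack, heap beyond that.
    Span<int> scratch = count <= 128 ? stackalloc int[count] : new int[count];

    for (int i = 0; i < scratch.Length; i++)
        scratch[i] = i * i;

    int sum = 0;
    foreach (int value in scratch)
        sum += value;
    return sum;
}

static async Task<int> ProcessAsync(int count)
{
    await Task.Delay(10);
    int result = SumOfSquares(count);   // the span lives and dies inside this call
    await Task.Delay(10);
    return result;
}

Console.WriteLine(await ProcessAsync(4)); // 0 + 1 + 4 + 9 = 14
```

Only the plain int result crosses the await, so the compiler has nothing to hoist into the async state machine.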

2. Stack-allocated memory cannot outlive its scope:

// ❌ DANGEROUS: Would return dangling pointer
Span<int> GetNumbers()
{
    Span<int> numbers = stackalloc int[10];
    return numbers; // Error CS8352: Cannot return
}

// ✅ CORRECT: Process within scope
void ProcessNumbers()
{
    Span<int> numbers = stackalloc int[10];
    for (int i = 0; i < numbers.Length; i++)
        numbers[i] = i * 2;
    // Automatically freed when method returns
}

3. Size limits for stack allocation:

// ✅ SAFE: Small allocation (400 bytes)
Span<int> small = stackalloc int[100];

// ⚠️ RISKY: Large allocation (40KB)
Span<int> large = stackalloc int[10_000]; // May cause StackOverflowException

// ✅ BETTER: Use threshold pattern
const int StackThreshold = 512; // bytes
int size = GetRequiredSize();

Span<byte> buffer = size <= StackThreshold
    ? stackalloc byte[size]
    : new byte[size];

💡 Rule of Thumb: Keep stackalloc under 1KB (ideally under 512 bytes) to avoid stack overflow risks.

ReadOnlySpan<T> for Immutable Views

When you need read-only access, use ReadOnlySpan<T>:

// Prevents accidental modification
ReadOnlySpan<char> text = "Hello, World!".AsSpan();

// ❌ COMPILER ERROR
text[0] = 'h'; // Error: Cannot modify ReadOnlySpan

// ✅ CORRECT: Read-only operations
bool startsWithH = text[0] == 'H';
ReadOnlySpan<char> hello = text.Slice(0, 5);

This is especially powerful for string processing without allocations:

// Traditional: Creates substring (heap allocation)
string text = "user@example.com";
string domain = text.Substring(text.IndexOf('@') + 1); // Allocates!

// Modern: Zero-allocation slicing
ReadOnlySpan<char> span = text.AsSpan();
int atIndex = span.IndexOf('@');
ReadOnlySpan<char> domainSpan = span.Slice(atIndex + 1); // No allocation!
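
The same slicing idea extends to small parsers. As a sketch, the top-level program below splits a "key = value" line and parses the value without creating any intermediate strings. TryParseSetting is a hypothetical helper invented for this example, and the "key = value" format is an assumption.

```csharp
using System;

// Hypothetical helper: splits "key = value" and parses the value as an int.
static bool TryParseSetting(ReadOnlySpan<char> line,
                            out ReadOnlySpan<char> key,
                            out int value)
{
    key = default;
    value = 0;

    int eq = line.IndexOf('=');
    if (eq <= 0)
        return false;                       // no '=' or empty key

    key = line.Slice(0, eq).Trim();         // view into the original text
    return int.TryParse(line.Slice(eq + 1).Trim(), out value);
}

if (TryParseSetting("timeout = 30".AsSpan(), out var name, out int seconds))
{
    // ToString() allocates, but only here, for display.
    Console.WriteLine($"{name.ToString()} -> {seconds}"); // timeout -> 30
}
```

The key span points into the caller's string, so it stays valid for as long as that string is reachable.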

Examples

Example 1: Fast Buffer Processing Without GC

Let's parse a CSV line without allocating temporary strings:

public static void ParseCsvLine(ReadOnlySpan<char> line)
{
    // Allocate small buffer for field indices on stack
    Span<int> commaPositions = stackalloc int[10]; // Max 10 fields
    int fieldCount = 0;
    
    // Find all comma positions
    for (int i = 0; i < line.Length && fieldCount < 10; i++)
    {
        if (line[i] == ',')
            commaPositions[fieldCount++] = i;
    }
    
    // Extract fields using zero-copy slicing
    int start = 0;
    for (int i = 0; i < fieldCount; i++)
    {
        ReadOnlySpan<char> field = line.Slice(start, commaPositions[i] - start);
        ProcessField(field);
        start = commaPositions[i] + 1;
    }
    
    // Process last field
    if (start < line.Length)
    {
        ReadOnlySpan<char> lastField = line.Slice(start);
        ProcessField(lastField);
    }
}

static void ProcessField(ReadOnlySpan<char> field)
{
    // Parse without allocating strings
    if (int.TryParse(field, out int value))
        Console.WriteLine($"Integer: {value}");
}

// Usage
string csvLine = "42,hello,99,world";
ParseCsvLine(csvLine.AsSpan()); // No temporary strings are created for the fields

Why this is fast:

  • stackalloc for temporary buffer - no GC
  • Slice() creates views without copying
  • TryParse has overloads accepting ReadOnlySpan<char>
  • Heap allocations while splitting and parsing: zero (the Console.WriteLine call in ProcessField is for demonstration only and does allocate)

Example 2: Conditional Stack/Heap Allocation Pattern

This is a production-ready pattern used in .NET Core libraries:

public static string Base64Encode(ReadOnlySpan<byte> data)
{
    const int MaxStackAlloc = 256;
    
    int bufferSize = ((data.Length + 2) / 3) * 4; // Padded Base64 length: 4 chars per 3 input bytes, rounded up
    
    // Use stack for small data, heap for large
    Span<char> buffer = bufferSize <= MaxStackAlloc
        ? stackalloc char[bufferSize]
        : new char[bufferSize];
    
    // Convert to Base64 using the buffer
    bool success = Convert.TryToBase64Chars(data, buffer, out int charsWritten);
    
    if (!success)
        throw new InvalidOperationException("Encoding failed");
    
    // Return string from the buffer
    return new string(buffer.Slice(0, charsWritten));
}

// Usage
byte[] smallData = new byte[50];
string encoded1 = Base64Encode(smallData); // Uses stackalloc

byte[] largeData = new byte[1000];
string encoded2 = Base64Encode(largeData); // Uses heap allocation

Pattern breakdown:

| Step | Action                         | Benefit                     |
|------|--------------------------------|-----------------------------|
| 1    | Calculate required buffer size | Avoid over-allocation       |
| 2    | Compare against threshold      | Safety check for stack size |
| 3    | Conditional allocation         | Optimize for common case    |
| 4    | Use same Span API              | Code works for both paths   |

Example 3: Efficient String Manipulation

Replace parts of a string without intermediate allocations:

public static string ReplaceVowels(string input, char replacement)
{
    // Work on stack for small strings
    Span<char> buffer = input.Length <= 128
        ? stackalloc char[input.Length]
        : new char[input.Length];
    
    // Copy to mutable buffer
    input.AsSpan().CopyTo(buffer);
    
    // Modify in place
    for (int i = 0; i < buffer.Length; i++)
    {
        char c = char.ToLowerInvariant(buffer[i]); // Culture-independent comparison
        if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
            buffer[i] = replacement;
    }
    
    return new string(buffer);
}

// Usage
string result = ReplaceVowels("Hello World", '*');
Console.WriteLine(result); // H*ll* W*rld

Performance comparison (indicative figures; measure on your own hardware):

Traditional (StringBuilder):     ████████████ 120ns, 240 bytes allocated
Stackalloc + Span:               ███ 35ns, no temporary allocations (small strings; the result string is still allocated)
                                 🔺 3.4x faster, far less GC pressure

Example 4: Working with Binary Data

Parsing network packets efficiently:

public readonly struct PacketHeader
{
    public byte Version { get; init; }
    public ushort PacketId { get; init; }
    public int DataLength { get; init; }
}

public static PacketHeader ParseHeader(ReadOnlySpan<byte> data)
{
    // Validate minimum size
    if (data.Length < 7)
        throw new ArgumentException("Invalid packet size");
    
    // Parse fields without allocating objects
    return new PacketHeader
    {
        Version = data[0],
        PacketId = BitConverter.ToUInt16(data.Slice(1, 2)),
        DataLength = BitConverter.ToInt32(data.Slice(3, 4))
    };
}

// Usage with stack-allocated buffer
public static void ProcessPacket(Stream networkStream)
{
    Span<byte> headerBuffer = stackalloc byte[7];
    networkStream.ReadExactly(headerBuffer); // .NET 7+; plain Read may return fewer than 7 bytes
    
    PacketHeader header = ParseHeader(headerBuffer);
    
    Console.WriteLine($"Version: {header.Version}");
    Console.WriteLine($"Packet ID: {header.PacketId}");
    Console.WriteLine($"Data Length: {header.DataLength} bytes");
}

Why this pattern works:

  • Fixed-size header (7 bytes) - perfect for stackalloc
  • ReadOnlySpan<byte> prevents accidental modification
  • Slice() extracts fields without copying
  • Zero allocations for header parsing
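
One caveat: BitConverter reads values in the byte order of the machine running the code, while a wire format has a fixed byte order. The sketch below uses BinaryPrimitives to state the byte order explicitly. It assumes the header is big-endian (network order), which the example above does not specify, and ReadHeader is a hypothetical name returning a tuple instead of PacketHeader so the program stays self-contained.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical variant of ParseHeader. Assumption: fields are big-endian on the wire.
static (byte Version, ushort PacketId, int DataLength) ReadHeader(ReadOnlySpan<byte> data)
{
    if (data.Length < 7)
        throw new ArgumentException("Header needs 7 bytes", nameof(data));

    return (data[0],
            BinaryPrimitives.ReadUInt16BigEndian(data.Slice(1, 2)),
            BinaryPrimitives.ReadInt32BigEndian(data.Slice(3, 4)));
}

// Sample bytes: version 1, packet id 0x0102, data length 0x00000100.
Span<byte> packet = stackalloc byte[7] { 1, 0x01, 0x02, 0x00, 0x00, 0x01, 0x00 };
var header = ReadHeader(packet);

Console.WriteLine(header); // (1, 258, 256)
```

If the protocol is little-endian, swap in ReadUInt16LittleEndian and ReadInt32LittleEndian; either way the result no longer depends on the host CPU.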

Common Mistakes

❌ Mistake 1: Allocating Too Much on Stack

// DANGER: 40KB on stack - likely to crash!
Span<byte> hugeBuffer = stackalloc byte[40_000];

✅ Fix: Use heap for large allocations

const int MaxStackBytes = 512;
int size = GetRequiredSize();

Span<byte> buffer = size <= MaxStackBytes
    ? stackalloc byte[size]
    : new byte[size];

❌ Mistake 2: Trying to Store Span in a Field

public class DataProcessor
{
    private Span<int> _buffer; // ERROR: Cannot be a field
}

✅ Fix: Use Memory<T> for storable references

public class DataProcessor
{
    private Memory<int> _buffer; // OK: Memory<T> can be stored
    
    public void Process()
    {
        Span<int> span = _buffer.Span; // Get Span when needed
        // ... work with span ...
    }
}

❌ Mistake 3: Using Stackalloc in Async Methods

public async Task ProcessAsync()
{
    Span<byte> buffer = stackalloc byte[100]; // ERROR CS4012 (before C# 13)
    await Task.Delay(100);
}

✅ Fix: Use heap allocation in async context

public async Task ProcessAsync()
{
    byte[] buffer = new byte[100]; // Use array
    await Task.Delay(100);
    // Or use Memory<T> for async-friendly APIs
}

❌ Mistake 4: Returning Stack-Allocated Span

public Span<int> GetBuffer()
{
    Span<int> data = stackalloc int[10];
    return data; // ERROR: Would return dangling reference
}

✅ Fix: Return array-backed Span or use output parameter

public Span<int> GetBuffer()
{
    int[] array = new int[10];
    return array.AsSpan(); // Safe: array lives on heap
}

// Or better: let caller provide buffer
public void FillBuffer(Span<int> buffer)
{
    for (int i = 0; i < buffer.Length; i++)
        buffer[i] = i * 2;
}

❌ Mistake 5: Forgetting Bounds Checking

While Span<T> has bounds checking, you still need to validate inputs:

public void ProcessData(Span<byte> data)
{
    // Crashes if data.Length < 4!
    int value = BitConverter.ToInt32(data);
}

✅ Fix: Always validate size requirements

public void ProcessData(Span<byte> data)
{
    if (data.Length < 4)
        throw new ArgumentException("Buffer too small");
    
    int value = BitConverter.ToInt32(data.Slice(0, 4));
}

Performance Characteristics

Here's what you can gain by using stackalloc and Span<T> (indicative figures; actual numbers depend on hardware, runtime version, and workload):

| Operation                           | Traditional (Heap)  | Stackalloc + Span      | Improvement    |
|-------------------------------------|---------------------|------------------------|----------------|
| Small buffer allocation (128 bytes) | ~25ns + GC pressure | ~2ns, no GC            | ✅ ~12x faster |
| String substring                    | ~40ns + allocation  | ~5ns, no allocation    | ✅ ~8x faster  |
| Array slicing                       | Array.Copy required | Span.Slice (free)      | ✅ Zero-copy   |
| CSV parsing (10 fields)             | ~500ns + strings    | ~80ns, no strings      | ✅ ~6x faster  |

Real-world impact: ASP.NET Core uses these patterns extensively, which is one of the reasons for its large throughput gains over older frameworks.

When to Use Each Tool

🎯 Decision Matrix

| Scenario                            | Best Choice                     | Why                                 |
|-------------------------------------|---------------------------------|-------------------------------------|
| Small temporary buffer (<512 bytes) | stackalloc                      | Zero GC, ultra-fast                 |
| Size unknown at compile time        | Conditional (threshold pattern) | Safe + optimized                    |
| Need to store in field/property     | Memory<T>                       | Span cannot be stored               |
| Async method                        | Memory<T> or array              | Span cannot live across an await    |
| String manipulation (read-only)     | ReadOnlySpan<char>              | Zero-copy slicing                   |
| Binary protocol parsing             | stackalloc + Span<byte>         | Perfect fit                         |
| Large data processing (>1KB)        | Array + Span<T> view            | Avoids stack overflow risk          |

Advanced Topics Preview

🔍 Memory<T> vs Span<T>:

  • Memory<T> is the "storable" version of Span<T>
  • Can be fields, can cross async boundaries
  • .Span property converts to Span<T> when needed
  • Slightly more overhead: each .Span access has to resolve the underlying object before it can hand out a span

🔍 MemoryMarshal for Advanced Scenarios:

// Reinterpret cast (unsafe but fast)
Span<byte> bytes = stackalloc byte[8];
Span<long> longs = MemoryMarshal.Cast<byte, long>(bytes);
longs[0] = 42; // Writes 8 bytes

🔍 ArrayPool Integration:

int[] rented = ArrayPool<int>.Shared.Rent(1000);
Span<int> span = rented.AsSpan(0, 1000);
// ... use span ...
ArrayPool<int>.Shared.Return(rented);
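
The pool combines naturally with the threshold pattern: stack for small inputs, a rented array for large ones, and a finally block so the array always goes back to the pool. The sketch below is a top-level program; ReverseText is a hypothetical helper and the 256-char limit is an assumption.

```csharp
using System;
using System.Buffers;

// Hypothetical helper: reverses a string using a stack or pooled scratch buffer.
static string ReverseText(string input)
{
    const int StackLimit = 256;              // chars, i.e. 512 bytes (assumed threshold)

    char[]? rented = null;
    Span<char> buffer = input.Length <= StackLimit
        ? stackalloc char[StackLimit]
        : (rented = ArrayPool<char>.Shared.Rent(input.Length));

    try
    {
        // Rent may hand back a larger array, so slice to the exact length.
        Span<char> work = buffer.Slice(0, input.Length);
        input.AsSpan().CopyTo(work);
        work.Reverse();
        return new string(work);             // the only allocation: the result
    }
    finally
    {
        if (rented is not null)
            ArrayPool<char>.Shared.Return(rented);
    }
}

Console.WriteLine(ReverseText("Hello"));     // olleH
```

Returning the array in finally matters: a rented array that is never returned is not a leak in the GC sense, but it defeats the purpose of pooling.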

Key Takeaways

✅ stackalloc allocates memory on the stack - blazingly fast, zero GC pressure, but limited in size and scope

✅ Span<T> provides a unified, safe API over any contiguous memory - stack, heap, or native

✅ Ref structs like Span<T> are compiler-enforced to stay on stack, preventing memory corruption

✅ Zero-copy operations with Slice() make string and array manipulation incredibly efficient

✅ Threshold pattern (stackalloc for small, heap for large) combines safety with performance

✅ ReadOnlySpan<T> prevents accidental mutations and clearly expresses intent

⚠️ Keep stackalloc buffers small (under 1KB, ideally under 512 bytes) to avoid stack overflow

⚠️ Spans cannot be stored in class fields or kept alive across an await, and stack-allocated spans cannot be returned from methods

💡 Pro tip: The .NET runtime team uses these patterns throughout the BCL - study System.Text.Json, System.IO.Pipelines, and ASP.NET Core source code for production examples!

Quick Reference Card

📋 Stackalloc & Span Cheat Sheet

Allocation Syntax:

Span<T> buffer = stackalloc T[size];              // Stack
Span<T> buffer = new T[size];                     // Heap
Span<T> buffer = size <= 512 ? stackalloc T[size] : new T[size]; // Conditional

Creating Spans:

Span<int> fromArray = array.AsSpan();
Span<int> slice = span.Slice(start, length);
ReadOnlySpan<char> text = "Hello".AsSpan();

Common Operations:

span[index]                    // Index access
span.Length                    // Size
span.Slice(start, length)      // Sub-view (zero-copy)
span.Clear()                   // Zero out
source.CopyTo(destination)     // Copy data
span.Fill(value)               // Fill with value

Limitations:

  • ❌ Cannot be fields in classes
  • ❌ Cannot be boxed to object
  • ❌ Cannot live across an await in async methods (use Memory<T>)
  • ❌ Cannot return stack-allocated spans
  • ⚠️ Keep stackalloc under 512 bytes

Safety Rules:

  • ✅ Indexer access is bounds-checked
  • ✅ Compiler prevents escaping stack
  • ✅ Cannot outlive source memory
  • ✅ ref struct prevents the span itself from being stored on the heap

Performance Wins:

  • 🔺 Much faster allocation for small buffers
  • 📦 Zero GC pressure
  • ⚡ Cache-friendly memory access
  • 🎯 Zero-copy slicing and viewing

📚 Further Study

  1. Microsoft Docs - Memory and Span Usage Guidelines
    https://learn.microsoft.com/en-us/dotnet/standard/memory-and-spans/
    Official documentation covering best practices, performance characteristics, and API reference.

  2. Adam Sitnik - Span Deep Dive
    https://adamsitnik.com/Span/
    Detailed technical explanation of how Span works internally, with benchmarks and real-world examples.

  3. Stephen Toub - Performance Improvements in .NET
    https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-8/
    Annual blog posts showing how Microsoft uses Span and stackalloc throughout the framework.


🎓 You now have the knowledge to write high-performance .NET code that rivals native languages! Practice these patterns in hot paths of your applications, and watch your allocation rates drop to near-zero. Remember: measure first, optimize second - but when you need speed, stackalloc and Span<T> are your secret weapons. Happy optimizing! 🚀